"Convex Until Proven Guilty": Dimension-Free Acceleration of Gradient Descent on Non-Convex Functions

Authors

  • Yair Carmon
  • John C. Duchi
  • Oliver Hinder
  • Aaron Sidford
Abstract

We develop and analyze a variant of Nesterov’s accelerated gradient descent (AGD) for minimization of smooth non-convex functions. We prove that one of two cases occurs: either our AGD variant converges quickly, as if the function was convex, or we produce a certificate that the function is “guilty” of being non-convex. This non-convexity certificate allows us to exploit negative curvature and obtain deterministic, dimension-free acceleration of convergence for non-convex functions. For a function $f$ with Lipschitz continuous gradient and Hessian, we compute a point $x$ with $\|\nabla f(x)\| \le \epsilon$ in $O(\epsilon^{-7/4} \log(1/\epsilon))$ gradient and function evaluations. Assuming additionally that the third derivative is Lipschitz, we require only $O(\epsilon^{-5/3} \log(1/\epsilon))$ evaluations.
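The "converges or produces a certificate" idea can be illustrated with a small sketch. It is a simplified illustration and not the paper's algorithm: it runs plain momentum gradient steps and, after each step, tests the first-order convexity inequality f(u) >= f(v) + grad f(v)^T (u - v) on the latest pair of points. Any pair that violates the inequality proves that f is not convex. The function names (`convexity_violation`, `agd_until_guilty`), the step size, the momentum value, and the choice of which pairs to test are all assumptions made for this example; the paper's actual certificate condition, constants, and use of negative curvature are not reproduced here.

```python
import numpy as np

def convexity_violation(f, grad, u, v, tol=1e-12):
    """Return True if (u, v) violates f(u) >= f(v) + grad(v)^T (u - v).

    A differentiable convex function satisfies this inequality for every
    pair of points, so a violation certifies non-convexity.
    """
    lower_bound = f(v) + grad(v) @ (u - v)
    return f(u) < lower_bound - tol

def agd_until_guilty(f, grad, x0, step, momentum=0.5, eps=1e-6, max_iter=1000):
    """Momentum gradient steps that stop on a small gradient or a certificate.

    Hypothetical helper for illustration only. Returns a tuple
    (status, point, certificate) where status is "converged", "guilty",
    or "max_iter", and certificate is a violating pair (u, v) or None.
    """
    x = np.asarray(x0, dtype=float)
    y = x.copy()
    for _ in range(max_iter):
        x_next = y - step * grad(y)  # gradient step from the extrapolated point
        # Test the newest pair of points in both orders.
        for u, v in ((x_next, y), (y, x_next)):
            if convexity_violation(f, grad, u, v):
                return "guilty", x_next, (u, v)
        if np.linalg.norm(grad(x_next)) <= eps:
            return "converged", x_next, None
        y = x_next + momentum * (x_next - x)  # momentum extrapolation
        x = x_next
    return "max_iter", x, None

# Convex example: f(x) = 0.5 * ||x||^2
convex_f = lambda x: 0.5 * float(x @ x)
convex_grad = lambda x: x

# Non-convex example: the saddle f(x) = x_0^2 - x_1^2
saddle_f = lambda x: float(x[0] ** 2 - x[1] ** 2)
saddle_grad = lambda x: np.array([2.0 * x[0], -2.0 * x[1]])

status_convex, x_convex, _ = agd_until_guilty(
    convex_f, convex_grad, [1.0, 2.0], step=0.5)
status_saddle, _, certificate = agd_until_guilty(
    saddle_f, saddle_grad, [0.1, 1.0], step=0.25)
```

On the convex quadratic the loop stops with a small gradient; on the saddle it stops with a violating pair of points, which is the kind of evidence the abstract calls a non-convexity certificate.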

Similar resources

Regret bounds for Non Convex Quadratic Losses Online Learning over Reproducing Kernel Hilbert Spaces

We present several online algorithms with dimension-free regret bounds for general nonconvex quadratic losses by viewing them as functions in Reproducing Kernel Hilbert Spaces (RKHS). In our work we adapt the Online Gradient Descent, Follow the Regularized Leader and the Conditional Gradient method meta algorithms for RKHS and provide regret bounds in this setting. By analyzing them as algorith...

Accelerating Asynchronous Algorithms for Convex Optimization by Momentum Compensation

Asynchronous algorithms have attracted much attention recently due to the crucial demands on solving large-scale optimization problems. However, the accelerated versions of asynchronous algorithms are rarely studied. In this paper, we propose the “momentum compensation” technique to accelerate asynchronous algorithms for convex problems. Specifically, we first accelerate the plain Asynchronous ...

Convergence Rate of Sign Stochastic Gradient Descent for Non-convex Functions

The sign stochastic gradient descent method (signSGD) utilises only the sign of the stochastic gradient in its updates. For deep networks, this one-bit quantisation has surprisingly little impact on convergence speed or generalisation performance compared to SGD. Since signSGD is effectively compressing the gradients, it is very relevant for distributed optimisation where gradients need to be a...

On Steepest Descent Algorithms for Discrete Convex Functions

This paper investigates the complexity of steepest descent algorithms for two classes of discrete convex functions, M-convex functions and L-convex functions. Simple tie-breaking rules yield complexity bounds that are polynomials in the dimension of the variables and the size of the effective domain. Combination of the present results with a standard scaling approach leads to an efficient algor...

Distributed Optimization of Convex Sum of Non-Convex Functions

We present a distributed solution to optimizing a convex function composed of several non-convex functions. Each non-convex function is privately stored with an agent while the agents communicate with neighbors to form a network. We show that the coupled consensus and projected gradient descent algorithm proposed in [1] can optimize a convex sum of non-convex functions under an additional assumption o...

Publication date: 2017